Dataset statistics
| Number of variables | 7 |
|---|---|
| Number of observations | 660 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 36.2 KiB |
| Average record size in memory | 56.2 B |
Variable types
| NUM | 7 |
|---|---|
Reproduction
| Analysis started | 2020-11-29 22:22:02.498119 |
|---|---|
| Analysis finished | 2020-11-29 22:22:10.382035 |
| Duration | 7.88 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
Warnings
| Sl_No has unique values | Unique |
|---|---|
| Total_visits_bank has 100 (15.2%) zeros | Zeros |
| Total_visits_online has 144 (21.8%) zeros | Zeros |
| Total_calls_made has 97 (14.7%) zeros | Zeros |
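The percentages in these warnings are the zero counts divided by the 660 observations. The following is a minimal sketch of that arithmetic, using only the counts reported here; the names `N_OBSERVATIONS`, `zero_counts` and `zero_share` are illustrative and not part of the report.

```python
# Zero counts per variable and the number of observations, as reported above.
N_OBSERVATIONS = 660
zero_counts = {
    "Total_visits_bank": 100,
    "Total_visits_online": 144,
    "Total_calls_made": 97,
}

# Share of zeros per variable, formatted to one decimal place like the report.
zero_share = {
    name: f"{count / N_OBSERVATIONS:.1%}" for name, count in zero_counts.items()
}

for name, share in zero_share.items():
    print(f"{name}: {share} zeros")
```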
Sl_No
Real number (ℝ≥0)
| Distinct count | 660 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 330.5 |
|---|---|
| Minimum | 1 |
| Maximum | 660 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 33.95 |
| Q1 | 165.75 |
| median | 330.5 |
| Q3 | 495.25 |
| 95-th percentile | 627.05 |
| Maximum | 660 |
| Range | 659 |
| Interquartile range (IQR) | 329.5 |
Descriptive statistics
| Standard deviation | 190.6698718 |
|---|---|
| Coefficient of variation (CV) | 0.576913379 |
| Kurtosis | -1.2 |
| Mean | 330.5 |
| Median Absolute Deviation (MAD) | 165 |
| Skewness | 0 |
| Sum | 218130 |
| Variance | 36355 |
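The figures in this block are consistent with a column that simply counts from 1 to 660, which is what the Sl_No warning and the first and last rows suggest. Under that assumption, the statistics can be recomputed with Python's standard library. The sketch below assumes the report uses the sample variance (n - 1 denominator) and linearly interpolated quantiles; the variable names are illustrative.

```python
import statistics

# Assumption: the column holds the serial numbers 1..660.
sl_no = list(range(1, 661))

total = sum(sl_no)
mean = statistics.mean(sl_no)
variance = statistics.variance(sl_no)  # sample variance, n - 1 denominator
std = statistics.stdev(sl_no)
cv = std / mean                        # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
median = statistics.median(sl_no)
mad = statistics.median(abs(x - median) for x in sl_no)

# Quartiles and 5th/95th percentiles with linear interpolation.
q1, _, q3 = statistics.quantiles(sl_no, n=4, method="inclusive")
ventiles = statistics.quantiles(sl_no, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]
iqr = q3 - q1

print(total, mean, variance, round(std, 7), round(cv, 9), mad, q1, q3, iqr, p5, p95)
```

Each value can be compared against the corresponding row of the tables above.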
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 660 | 1 | 0.2% | |
| 226 | 1 | 0.2% | |
| 224 | 1 | 0.2% | |
| 223 | 1 | 0.2% | |
| 222 | 1 | 0.2% | |
| 221 | 1 | 0.2% | |
| 220 | 1 | 0.2% | |
| 219 | 1 | 0.2% | |
| 218 | 1 | 0.2% | |
| 217 | 1 | 0.2% | |
| Other values (650) | 650 | 98.5% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 1 | 0.2% | |
| 2 | 1 | 0.2% | |
| 3 | 1 | 0.2% | |
| 4 | 1 | 0.2% | |
| 5 | 1 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 660 | 1 | 0.2% | |
| 659 | 1 | 0.2% | |
| 658 | 1 | 0.2% | |
| 657 | 1 | 0.2% | |
| 656 | 1 | 0.2% |
Customer Key
Real number (ℝ≥0)
| Distinct count | 655 |
|---|---|
| Unique (%) | 99.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 55141.44393939394 |
|---|---|
| Minimum | 11265 |
| Maximum | 99843 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.2 KiB |
Quantile statistics
| Minimum | 11265 |
|---|---|
| 5-th percentile | 15317.6 |
| Q1 | 33825.25 |
| median | 53874.5 |
| Q3 | 77202.5 |
| 95-th percentile | 96301.45 |
| Maximum | 99843 |
| Range | 88578 |
| Interquartile range (IQR) | 43377.25 |
Descriptive statistics
| Standard deviation | 25627.7722 |
|---|---|
| Coefficient of variation (CV) | 0.4647642566 |
| Kurtosis | -1.147577576 |
| Mean | 55141.44394 |
| Median Absolute Deviation (MAD) | 21533 |
| Skewness | 0.0514619906 |
| Sum | 36393353 |
| Variance | 656782707.9 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 47437 | 2 | 0.3% | |
| 37252 | 2 | 0.3% | |
| 97935 | 2 | 0.3% | |
| 96929 | 2 | 0.3% | |
| 50706 | 2 | 0.3% | |
| 75775 | 1 | 0.2% | |
| 43679 | 1 | 0.2% | |
| 33295 | 1 | 0.2% | |
| 67911 | 1 | 0.2% | |
| 94529 | 1 | 0.2% | |
| Other values (645) | 645 | 97.7% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 11265 | 1 | 0.2% | |
| 11398 | 1 | 0.2% | |
| 11412 | 1 | 0.2% | |
| 11466 | 1 | 0.2% | |
| 11562 | 1 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 99843 | 1 | 0.2% | |
| 99596 | 1 | 0.2% | |
| 99589 | 1 | 0.2% | |
| 99473 | 1 | 0.2% | |
| 99437 | 1 | 0.2% |
Avg_Credit_Limit
Real number (ℝ≥0)
| Distinct count | 110 |
|---|---|
| Unique (%) | 16.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 34574.242424242424 |
|---|---|
| Minimum | 3000 |
| Maximum | 200000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.2 KiB |
Quantile statistics
| Minimum | 3000 |
|---|---|
| 5-th percentile | 6000 |
| Q1 | 10000 |
| median | 18000 |
| Q3 | 48000 |
| 95-th percentile | 121100 |
| Maximum | 200000 |
| Range | 197000 |
| Interquartile range (IQR) | 38000 |
Descriptive statistics
| Standard deviation | 37625.4878 |
|---|---|
| Coefficient of variation (CV) | 1.088251981 |
| Kurtosis | 5.133842332 |
| Mean | 34574.24242 |
| Median Absolute Deviation (MAD) | 11000 |
| Skewness | 2.202395623 |
| Sum | 22819000 |
| Variance | 1415677333 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8000 | 35 | 5.3% | |
| 6000 | 31 | 4.7% | |
| 9000 | 28 | 4.2% | |
| 13000 | 28 | 4.2% | |
| 10000 | 26 | 3.9% | |
| 19000 | 26 | 3.9% | |
| 7000 | 24 | 3.6% | |
| 11000 | 24 | 3.6% | |
| 18000 | 23 | 3.5% | |
| 14000 | 23 | 3.5% | |
| Other values (100) | 392 | 59.4% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 3000 | 1 | 0.2% | |
| 5000 | 21 | 3.2% | |
| 6000 | 31 | 4.7% | |
| 7000 | 24 | 3.6% | |
| 8000 | 35 | 5.3% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 200000 | 1 | 0.2% | |
| 195000 | 2 | 0.3% | |
| 187000 | 1 | 0.2% | |
| 186000 | 1 | 0.2% | |
| 184000 | 1 | 0.2% |
Total_Credit_Cards
Real number (ℝ≥0)
| Distinct count | 10 |
|---|---|
| Unique (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.706060606060606 |
|---|---|
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 5 |
| Q3 | 6 |
| 95-th percentile | 8 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.16783486 |
|---|---|
| Coefficient of variation (CV) | 0.4606474589 |
| Kurtosis | -0.3697703016 |
| Mean | 4.706060606 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.1448789903 |
| Sum | 3106 |
| Variance | 4.699507978 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 4 | 151 | 22.9% | |
| 6 | 117 | 17.7% | |
| 7 | 101 | 15.3% | |
| 5 | 74 | 11.2% | |
| 2 | 64 | 9.7% | |
| 1 | 59 | 8.9% | |
| 3 | 53 | 8.0% | |
| 10 | 19 | 2.9% | |
| 9 | 11 | 1.7% | |
| 8 | 11 | 1.7% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 59 | 8.9% | |
| 2 | 64 | 9.7% | |
| 3 | 53 | 8.0% | |
| 4 | 151 | 22.9% | |
| 5 | 74 | 11.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 10 | 19 | 2.9% | |
| 9 | 11 | 1.7% | |
| 8 | 11 | 1.7% | |
| 7 | 101 | 15.3% | |
| 6 | 117 | 17.7% |
Total_visits_bank
Real number (ℝ≥0)
| Distinct count | 6 |
|---|---|
| Unique (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.403030303030303 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 100 |
| Zeros (%) | 15.2% |
| Memory size | 5.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.631812876 |
|---|---|
| Coefficient of variation (CV) | 0.6790646267 |
| Kurtosis | -1.104274131 |
| Mean | 2.403030303 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.1418960148 |
| Sum | 1586 |
| Variance | 2.662813262 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2 | 158 | 23.9% | |
| 1 | 112 | 17.0% | |
| 3 | 100 | 15.2% | |
| 0 | 100 | 15.2% | |
| 5 | 98 | 14.8% | |
| 4 | 92 | 13.9% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 100 | 15.2% | |
| 1 | 112 | 17.0% | |
| 2 | 158 | 23.9% | |
| 3 | 100 | 15.2% | |
| 4 | 92 | 13.9% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 5 | 98 | 14.8% | |
| 4 | 92 | 13.9% | |
| 3 | 100 | 15.2% | |
| 2 | 158 | 23.9% | |
| 1 | 112 | 17.0% |
Total_visits_online
Real number (ℝ≥0)
| Distinct count | 16 |
|---|---|
| Unique (%) | 2.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.606060606060606 |
|---|---|
| Minimum | 0 |
| Maximum | 15 |
| Zeros | 144 |
| Zeros (%) | 21.8% |
| Memory size | 5.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 4 |
| 95-th percentile | 9.05 |
| Maximum | 15 |
| Range | 15 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.93572412 |
|---|---|
| Coefficient of variation (CV) | 1.12649879 |
| Kurtosis | 5.739571572 |
| Mean | 2.606060606 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 2.225606714 |
| Sum | 1720 |
| Variance | 8.618476112 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2 | 189 | 28.6% | |
| 0 | 144 | 21.8% | |
| 1 | 109 | 16.5% | |
| 4 | 69 | 10.5% | |
| 5 | 54 | 8.2% | |
| 3 | 44 | 6.7% | |
| 15 | 10 | 1.5% | |
| 7 | 7 | 1.1% | |
| 12 | 6 | 0.9% | |
| 10 | 6 | 0.9% | |
| Other values (6) | 22 | 3.3% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 144 | 21.8% | |
| 1 | 109 | 16.5% | |
| 2 | 189 | 28.6% | |
| 3 | 44 | 6.7% | |
| 4 | 69 | 10.5% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 15 | 10 | 1.5% | |
| 14 | 1 | 0.2% | |
| 13 | 5 | 0.8% | |
| 12 | 6 | 0.9% | |
| 11 | 5 | 0.8% |
Total_calls_made
Real number (ℝ≥0)
| Distinct count | 11 |
|---|---|
| Unique (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.5833333333333335 |
|---|---|
| Minimum | 0 |
| Maximum | 10 |
| Zeros | 97 |
| Zeros (%) | 14.7% |
| Memory size | 5.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 9 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.865316818 |
|---|---|
| Coefficient of variation (CV) | 0.7996232979 |
| Kurtosis | -0.5182644359 |
| Mean | 3.583333333 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.6589053024 |
| Sum | 2365 |
| Variance | 8.210040465 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 4 | 108 | 16.4% | |
| 0 | 97 | 14.7% | |
| 2 | 91 | 13.8% | |
| 1 | 90 | 13.6% | |
| 3 | 83 | 12.6% | |
| 6 | 39 | 5.9% | |
| 7 | 35 | 5.3% | |
| 9 | 32 | 4.8% | |
| 8 | 30 | 4.5% | |
| 5 | 29 | 4.4% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 97 | 14.7% | |
| 1 | 90 | 13.6% | |
| 2 | 91 | 13.8% | |
| 3 | 83 | 12.6% | |
| 4 | 108 | 16.4% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 10 | 26 | 3.9% | |
| 9 | 32 | 4.8% | |
| 8 | 30 | 4.5% | |
| 7 | 35 | 5.3% | |
| 6 | 39 | 5.9% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the sign of the slope does. To calculate r for two variables X and Y, divide the covariance of X and Y by the product of their standard deviations.
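The calculation described above can be sketched in a few lines of plain Python. This is an illustration of the formula, not the code the report itself runs; the function name `pearson_r` and the sample values are made up for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations; the 1/(n - 1) factors of the covariance
    # and the standard deviations cancel, so they are omitted.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only (not taken from the dataset).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(x, y)

# Shifting and rescaling each variable separately (positive scale) leaves r unchanged.
r_shifted = pearson_r([10 * v + 7 for v in x], [0.5 * v - 3 for v in y])

# A perfectly decreasing linear relationship gives r = -1.
r_negative = pearson_r(x, [-v for v in x])

print(r, r_shifted, r_negative)
```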
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, divide the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
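As a sketch of that recipe, the example below ranks both variables (tied values share the average of their ranks, a common convention assumed here) and applies the Pearson formula to the ranks. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are illustrative, and the data is synthetic.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Synthetic data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)  # monotonic relationship, so rho is 1
r = pearson_r(x, y)       # linear fit is imperfect, so r is below 1
print(rho, r)
```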
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, count the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
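The pair counting can be sketched directly from that definition. The function below is a plain illustration (the name `kendall_tau` and the sample values are made up); it counts tied pairs as neither concordant nor discordant, whereas library implementations often apply a tie correction, so results on data with many ties may differ.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs, as described above."""
    concordant = discordant = n_pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        n_pairs += 1
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables order the pair the same way
        elif product < 0:
            discordant += 1   # the variables order the pair in opposite ways
        # product == 0 is a tie and counts toward neither total
    return (concordant - discordant) / n_pairs


# Illustrative values: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau(x, y)  # 5 concordant, 1 discordant, 6 pairs
print(tau)
```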
Phik (φk)
Phik (φk) is a practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available from the Phik project.
First rows
| | Sl_No | Customer Key | Avg_Credit_Limit | Total_Credit_Cards | Total_visits_bank | Total_visits_online | Total_calls_made |
|---|---|---|---|---|---|---|---|
| 0 | 1 | 87073 | 100000 | 2 | 1 | 1 | 0 |
| 1 | 2 | 38414 | 50000 | 3 | 0 | 10 | 9 |
| 2 | 3 | 17341 | 50000 | 7 | 1 | 3 | 4 |
| 3 | 4 | 40496 | 30000 | 5 | 1 | 1 | 4 |
| 4 | 5 | 47437 | 100000 | 6 | 0 | 12 | 3 |
| 5 | 6 | 58634 | 20000 | 3 | 0 | 1 | 8 |
| 6 | 7 | 48370 | 100000 | 5 | 0 | 11 | 2 |
| 7 | 8 | 37376 | 15000 | 3 | 0 | 1 | 1 |
| 8 | 9 | 82490 | 5000 | 2 | 0 | 2 | 2 |
| 9 | 10 | 44770 | 3000 | 4 | 0 | 1 | 7 |
Last rows
| | Sl_No | Customer Key | Avg_Credit_Limit | Total_Credit_Cards | Total_visits_bank | Total_visits_online | Total_calls_made |
|---|---|---|---|---|---|---|---|
| 650 | 651 | 78996 | 195000 | 10 | 1 | 12 | 2 |
| 651 | 652 | 78404 | 132000 | 9 | 1 | 12 | 2 |
| 652 | 653 | 28525 | 156000 | 8 | 1 | 8 | 0 |
| 653 | 654 | 51826 | 95000 | 10 | 0 | 15 | 1 |
| 654 | 655 | 65750 | 172000 | 10 | 1 | 9 | 1 |
| 655 | 656 | 51108 | 99000 | 10 | 1 | 10 | 0 |
| 656 | 657 | 60732 | 84000 | 10 | 1 | 13 | 2 |
| 657 | 658 | 53834 | 145000 | 8 | 1 | 9 | 1 |
| 658 | 659 | 80655 | 172000 | 10 | 1 | 15 | 0 |
| 659 | 660 | 80150 | 167000 | 9 | 0 | 12 | 2 |